Client Report - What’s in a Name?

Unit 1 Task 2

Author

Shengjian Zhou

import polars as pl
import numpy as np
from lets_plot import *

LetsPlot.setup_html(isolated_frame=True)
# Learn more about Code Cells: https://quarto.org/docs/reference/cells/cells-jupyter.html

# Load the baby-names dataset used by every question below.
df = pl.read_csv("names_year_csv.csv")

QUESTION 1

How does your name at your birth year compare to its use historically? You must provide a chart. The year labels on your charts should not include a comma.

According to the graph, the name “John” peaked in 1947 and began to decline in 1964.

# Q1
# Select the needed columns and keep only the rows for the name "John".
john_total = df.select(["name", "year", "Total"]).filter(pl.col("name") == "John")

# Draw the line chart of yearly totals.
(
  ggplot(john_total, aes(x="year", y="Total"))
  + geom_line()
  # Format year ticks as plain integers so labels read 1947, not 1,947.
  + scale_x_continuous(format="d")
  + labs(
    x="Year",
    y="Total babies named \"John\"",
    title="Usage of the name \"John\" over time",
  )
)

QUESTION 2

Think of a unique name from a famous movie. Plot the usage of that name and see how changes line up with the movie release. Does it look like the movie had an effect on usage?

According to the graph, usage of the name “Ariel” rose sharply after The Little Mermaid was released in 1989, which suggests the movie had an effect.

# Q2
# Select the needed columns and keep only the rows for the name "Ariel".
ariel_total = df.select(["name", "year", "Total"]).filter(pl.col("name") == "Ariel")

# Draw the line chart and mark the movie's release year.
(
  ggplot(ariel_total, aes(x="year", y="Total"))
  + geom_line()
  # Dashed vertical line at the 1989 release of The Little Mermaid.
  + geom_vline(
    xintercept=1989,
    linetype="dashed",
    color="#4DA3FF",
  )
  # Rotated label placed just left of the release line.
  + geom_text(
    x=1987,
    y=3000,
    label="The Little Mermaid release year",
    angle=90,
    color="#4DA3FF",
  )
  # Format year ticks as plain integers so labels contain no comma.
  + scale_x_continuous(format="d")
  + labs(
    x="Year",
    y="Total babies named \"Ariel\"",
    title="Usage of the name \"Ariel\" over time",
  )
)

QUESTION 3

Mary, Martha, Peter, and Paul are all Christian names. From 1920 - 2000, compare the name usage of each of the four names in a single chart. What trends do you notice?

From 1920 to 2000, all four Christian names peaked around the mid-20th century and then declined. Mary showed the most dramatic drop, which may reflect a cultural shift away from traditional religious names over time (OpenAI 2026).

# Q3
# Select the needed columns, then keep the four names within 1920-2000.
christian_names = df.select(["name", "year", "Total"]).filter(
  pl.col("name").is_in(["Mary", "Martha", "Peter", "Paul"])
  & pl.col("year").is_between(1920, 2000)
)

# Draw one line per name on a single chart.
(
  ggplot(christian_names, aes(x="year", y="Total", color="name"))
  + geom_line()
  # Format year ticks as plain integers so labels contain no comma.
  + scale_x_continuous(format="d")
  + labs(
    x="Year",
    y="Total babies per name",
    color="Name",
    title="Usage of Mary, Martha, Peter, and Paul, 1920-2000",
  )
)

References